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ABSTRACT 


An algorithm is developed for estimating the state of a linear dy- 
namic system excited by a random sequence. The input data are noisy 
observations which are nonlinear functions of the state. The estimates 
are best in the sense of least squared residuals. A significant problem 
in radar tracking is investigated and the effectiveness of the algorithm 


verified. 
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les Introduction. 

The problem to be considered is that of cmaite estimation where the 
observations are nonlinear functions of the bate of the system corrupted 
by additive white noies. The equations of motion of the state are lin- 
ear in the state and the excitation which igeelegee random caieteecieite orsueah 
significant constraint on the solution is that the oaeaectatel alent 
to this problem must lead to an ea ici procedure which can be cea d 
on a digital computer and the scheme must produce estimates as the obser- 
vations are received. 

The theory for the case where system dynamics and observation func- 
tions are linear is very highly developed. However, when AGI deed ies 
are introduced there are virtually no completely satisfactory solutions. 

Two methods for handling the nonlinear problem have been introduced. 
The first entails a linearization about a nominal trajectory in state 
space. Its success depends upon the accuracy of the nominal trajectory. 
This technique has little hope on eaeees in a situation where there is 
almost no prior information about the trajectory. The target acquisition 
problem is an example of such a situation. 

The second method is more of theoretical interest than as a candi- 
date for a computational procedure. In this development the viewpoint is 
taken that the output of a state estimator should be the conditional pro- 
bability density function, conditioned upon all past data. Computational 
difficulties arise from an effort to compute a complete function over He 
entire state space as compared to a more conventional estimator which 
selects a single point in the state space as the most likely Stace 

This study was particularly motivated by the difficulties encountered 


in filtering radar returns from an airborne target. The target dynamics 


are presumably describable by a linear dynamic system where the elements 
of the state vector are the position and velocity of the target in car- 
tesian coordinates measured from the radar. On the other hand, data avail- 
able to a filter from the radar will usually be in spherical coordinates. 
Thus there exists a known, nonlinear relationship or transformation be- 
tween the state of the system and observations. The results of a series 
of experiments using a filter based upon a simple linearization procedure 
are reported in Demetry and Hudson [4]. The operation of the filter was 
unsatisfactory under realistic conditions of initial uncertainty about 
target position, i.e., where to evaluate the partials involved in the 
linearization procedure. 

The present work fills a gap in the field of nonlinear estimation for 
problem in which there is little prior information and a computationally 
feasible estimation procedure is required. 

The problem is discussed and precisely formulated in Section 2. The 
original filtering problem is replaced by an associated minimization prob- 
lem. The work of Kalman is reviewed since it is known that the Kalman 
filter is the solution to the associated minimization problem when the 
observations are linear functions of the states. A special single-stage 
form of the problem is considered. It is shown that the nonlinear obser- 
vation problem can be approximated by a linear problem whose solution is 
known from Kalman's work. The resulting solution is then used to generate 
a new approximation to the original problem. Each iteration of the above 
process reduces the function to be minimized so long as the gradient is 
non-zero. It is shown that the whole class of problems originally con- 
sidered can be cast in the special single-stage form. 

The problem may be said to be solved at this point; however, for real- 


time calculations there must be procedures for controlling the number 
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of iterations. Iteration control procedures are discussed along with 
heuristic means for choosing the control parameters. Even with the num- 
ber of iterations kept to a finite number, the computing requirements 


would increase indefinitely since each new set of data generates a new 


minimization problem over a larger number of variables. The Sones of 
this effect is discussed by introducing the concept of noise generated 
by the nonlinearities. The overall algorithm is discussed with respect 
to implementation on a digital computer. 

The method was used on a realistic radar tracking problem. The 
models for the target and the radar are discussed. The results of the 


study indicate that the method produces reasonable estimates of the states 


of the system. 


ll 


Di Detailed Statement of the Problem. 

The specification of the problem may be viewed as consisting of four 
parts. Three of these are different types of information about the system 
under observation and the last is an implicit statement of how the avail- 
able information should be combined to form an estimate of the state of 
the system. 

The first type of information about the system is expressed by form- 
ing a dynamic model of the system. In this way the relation between 
states at different times is made explicit. It is only through the dy- 
namic relation of the states that observations taken at diverse times 
have any relation to one another. In general usage the term filter is 
associated with a sequence of observations which are related to one 
another (correlated). The states and the dynamic model embody this cor- 
relation. 

The second type of information is the relationship of the observa- 
tion at a given time to the state at the same time. 

The third type of information is the a priori knowledge about the 
state of the system. 

Finally the best estimate of the state is defined. In general, none 
of the information about the state is definitive; it is all subject to 
some uncertainty and except for this uncertainty, it would be contra- 
dictory. The best estimate is defined to effect a certain compromise 
among all of the available information. 

Since the resulting definition is implicit, computational difficul- 
ties occur. The resolution of these difficulties results in an explicit 
procedure for producing an estimate which is almost best within a reason- 


able computing time. Such an explicit procedure, or algorithm, will be 
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called a filter. 


The Dynamic Model 

The time evolution of the states of the system which is being observed 
are assumed to be adequately described by a difference equation which is 
linear in both the state and the excitation. The excitation is assumed to 
be a white random time sequence which has known first and second moments 
and is independent of the state. If the mean of the random excitation is 
not zero, then the random signal can be decomposed into a deterministic 
component and a zero-mean random component. The development will assume 
that there is no deterministic component (or non-zero mean). A short come 
ment will be made at the appropriate point outlining the required changes 
tor the case of deterministic inputs. These notions are concisely stated 


in (1) through (4). 


x(ktl) = & x(k) + T w(k) (1) 
Efw(k)] = 0 (2) 
Efw(k) w(i)"] = 15 fon wr (3) 
E[w(k) x(j)'] = 0 for ksj (4) 


where x is the state vector of n components, 

T is a known n x r input distribution matrix, 

w 18 a random excitation of r components, 

G¢ is a known r x n State transition matrix, 

Ef ] is the expectation operation, and 

T denotes the transpose. 
Equation (1) expresses the linearity of the state dynamics while (2), (3), 
and (4) express the qualities of zero mean, whiteness, and independence 


respectively about the excitation. The assumption that the covariance 
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of w(k) is the identity matrix involves no loss of generality as long as 
T is appropriately chosen. Consider the two random excitations ['w and 
IT'w' where 

E(w] = 0 

Efw'] = 0 

Ef w w?] = I 

E[w' wt) = Q 
By comparing first and second moments, the two random excitations are 
equivalent if rr = r'or'!. Q is a covariance matrix and thus is sym- 
metric and positive semi-definite. This implies that a decomposition can 


be found such that B pt = Q, and[ «w= I['B. 


The Observations 
The data available to the filter are nonlinear functions of the 
state corrupted by additive white noise. The functional relation is 
assumed to be twice differentiable in the state. The corrupting noise 
is assumed to have zero mean and known variance. The noise is assumed 
to be independent of the states and the excitation. These notions are 


concisely expressed as 


2(k) = h(e(k)) + v(k) (5) 
Efu(k)] = 0 (6) 
Efv(k) ¥(3)"] = 16 foe (7) 
E[v(k) x(j)"] = 0 (8) 
E[v (k) wi)" = 0 (9) 


where z(k) is an m vector of observations at time k, 
v(k) is an m vector of noise at time k, 


h( ) is an m vector of nonlinear functions of the states 
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R is the m x m covariance matrix of the measurement noise. 


A Priori Information 

In view of (1) any information relative to x(k) must also be consid- 
ered when estimating x(Ktl). In conventional linear sequential stage-by- 
stage estimation one can consider two distinct phases. The first is to 
bring forward all information from the past observations in the form of 
an a priori estimate using (1). The second phase is then to adjust this 
a priori estimate in view of the actual observations of the state. This 
is repeated from stage to stage. Clearly this process must start with a 
given a priori estimate for the first state. Sometimes this a priori 
estimate serves merely as a mathematical convenience for starting the fil- 
ter [9]. In other cases an appropriate a priori estimate is really avail- 
able. In any case an a priori estimate takes the form of an estimate of 
the initial state coupled with a measure of the accuracy of this estimate, 
the covariance of the error, defined as 


BL (x(1)-%(1/0)) (e(1/0))"] = P(1/v) (10) 


E(x(1)-x%(1/0)] = 0 (11) 
where x(1/U) is the a priori oalliteade of ¥(1), and 

P(1/0) is the covariance of the a priori estimate. 
The double index argument will be used throughout to indicate an estimate 


of the state associated with the first index based on observations up to 


and including those associated with the second index. 


Definition of Best Estimate 
The best estimate of sequence of states x(1) through x(k) will be 
called x(1/k) through x(k/k) and is defined as that sequence which mini- 


mizes the scalar quantity 
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Cx (1/k), 2(2/k), «oe, x(k/k)) = || x(1/k) - x (1/0) IW, 


k 

+) || xCi/k) = @ x(4-1/k) ile 
in 2 : 
k 


+ . | 2(i) - nO (i/k))||¢ (12) 
i=l e 


where the norm notation is introduced for compactness. By definition 
cin is equivalent to b Ab and the result is a scalar which is a quad- 
ratic function of the elements of b. 

A best estimate defineuin this manner might be called a weighted 
least-squares estimate generalized to include the case where the quantity 
to be estimated is changing somewhat randomly in time. 

The three different types of terms correspond to (10), (1), and (5) 
respectively. The terms inside the vertical bars are called residuals. 
The residuals are the difference between the expected value of a function 
of the true states and that same function evaluated at the estimate of 
the state. 

This definition of the best estimate can be interpreted in several 
ways which will be developed below. These interpretations cannot in any 
sense prove that estimates defined in this way are best. The most that 
can be hoped for is that the interpretations offered will enhance the- 
reasonableness of the resulting estimate. It must be realized that the 
best estimate is only what it is defined to be, which, in the final anale- 
ysis is certainly somewhat arbitrary. 

First, if all of the random sequences are assumed to be sequences of 
Gaussian (or normal) random variables, then it is possible to compute the 
probability of the observed data for a given sequence of states. This 


probability is viewed as a function of the true states. That sequence of 
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the states which maximizes the probability is taken as an estimate of 
the true states. The probability is commonly called the likelihood and 
can be expressed as 

L(a(1), 2(2), 02, 2(K)) = K exp [- 5 C(x(1), (2), .-05 200K) 
where K is a constant which is independent of the states if the weighing 


matrices are chosen as 


Ww, > P(i/o)”} 3713) 

Wo Tien (14) 
= 

Ww, (15) 


The covariances are assumed to be nonsingular for simplicity. For 
a detailed discussion of the case of singluar covariance matrix see 
Appendix I. It is clear that to maximize L one must minimize C. Thus 
for the case of Gaussian random variables the best estimate will be the 
so called maximum likelihood estimate. 

A second point of view is that it would be desirable to find an 
estimate which resulted in zero residuals; requiring zero residuals, how 
ever, would imply a larger number of constraints than there are adjustable 
parameters (estimates). This suggests that the best estimate would be 
some sort of compromise where all of the residuals are small. Such a com- 
promise is effected by setting up a weighted sum of squares of the re- 
siduals, C, as a function of the estimates, and selecting (or defining 
in this case) that set of estimates which minimize C as the best esti- 
mate. 

In general, the best estimate will depend upon the weighting chosen 
for each residual. In order to determine the appropriate weighting mat- 
rices it is helpful to consider under what circumstances equal weighting 


would be appropriate. A heuristically reasonable answer might be to 
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weight equally when each of the random components have the same variance. 
If each of the sets of equations are multiplied by appropriate matrices 
new random variables can be defined so that each has unit variance. The 
residuals from these new equations are computed and their squares all 
weighted equally. In this form the residuals have an intuitively rea- 
sonable weighting. But it can be shown that this is equivalent to 
weighting the original residuals with the inverse of the corresponding 
covariance. 
A very simple example should clarify the argument. Consider the 

problem of estimating x from two observations 

Z,= h, Ge) + vy 

z= h,@) ‘+ Vv. 
where Ef, ] «= 0, 

Efv.] = 0, 


2 Z 
Ev, ] = Oo)? 


2 2 
Ev] = 05° 
Dividing the first equation by oO, and the second by 0» yields 
. 2 F 
z,/o, h, (c/o, us 


and z,/9, = h, (x)/o, + u, 


where u, and u, are unit variance random variables. So the sum of squared 


residuals is formed 
Cx) = @,/o, - by @ )/o,)° + alo, ~ bye)/o,)" 


but this is equivalent to 


CO) = yn ()) oy + Gy - by&))*/o5 


18 


Therefore, if it is reasonable to weight equally residuals associated 
with random variables of equal variance then it follows for unequal vari- 
ances the weighting should be in inverse proportion to the variance. 

1? Wo and W, 


are chosen directly without recourse to any assumed random variables. 


Another point of view is that the weighting matrices W 


There is, of course, an equivalent problem cast in terms of random vari- 
ables. It is the author's view that it is easier to assessthe magnitude 
of the variances of the random variables than to choose an appropriate 
set of weighting matrices directly. 

A general model of the underlying physical process which generates 
the data has been presented. For any real physical situation the para- 
meters ©, IT, A, x(1/0) and P(1/0) will be numerical quantites and the non- 
linear functions h(¥%) will be a vector of explicit functions. This is the 
type of information which a filter designer must have before the filter 
can be constructed. For a given model there are many possible filters 
which might be considered; in each case presumably the output of the fil- 
ter would be best in some sense, often unspecified. The filter under 
consideration in this work is defined by the sense in which the estimates 
are best, i.e., the filter is a least-squared-residuals filter. Thus, it 
should be noted that the filtering problem has been transformed into a 
sequence of minimization problems. It will turn out that the solution to 
each minimization problem is itself a sequence of solutions to a much 
simpler minimization problem, namely the problem with linear observation 
functions. The solution to this simpler problem is well known and is 
discussed in the next section along with other filter techniques where 


the model is similar to the one described in this section. 
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3. Prior Work. 

The field of state estimation has received a great deal of interest 
recently, especially since the work of R. E. Kalman [7] and R. E. Kalman 
and R. Bucy [8] in linear estimation theory. For a comprehensive survey 
of the general field of estimation Deutsch [6] is suggested. Lee [9] 
presents a fine treatment of the theory, particularly with respect to the 
relationship between control and estimation theory. Cox [2, 3] is a 
specialized review of the efforts in the area of nonlinear estimation of 
which this work is a special case. 

The structure of the problem and the results obtained by Kalman [7] 
along with several extensions, modifications and alternate interpretations 
are discussed in some detail. This discussion provides a convenient refer- 
ence for comparison of the results of this thesis as well as an opportunity 
to establish certain known results which will be needed in the development 
of the method that follows. The method of trajectory linearization is dis- 
cussed and it is shown how a natural extension leads to the method used 


here. 


Kalman Filter 

Kalman (7] has solved a special case of the problem under consider- 
ation where the measurements are linear functions of the state. 
Symbolically 

2(k) = H x(k) + vk) (16) 

replaces (5). | 

Two cases are considered. In the first, all of the random sequences 
are assumed to be sequences of Gaussian random variables. With this as- 
sumption, the estimate is shown to be optimum in the sense that any linear 


function of the estimate is the minimum variance estimate of the same 
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linear function of the true state. The estimate turns out to be a linear 
function of the observations. 

In the second case, there are no assumptions about the form of the 
probability density functions of the random variables, but the estimate 
is assumed to be a linear function of the observations. The optimum 
estimate is defined in the same way. 

In either case the method of computing the optimum estimate (the 
filter) is the same. The sequence of operations can be envisioned as 
consisting of two steps. The first will be called the prediction equation. 

x(k+1/k) = &x(k/k) (17) 

The double argument notation will always indicate an estimate. The left 
side should be read; the estimate of the state at time k+l given data up 
to time k. The second step will be called the adjustment for newly re- 
ceived data. 

xe (kt 1/k+1) = x(ktl/k) + G(k) [2 (k+l) - F x(kt1/k)] (18) 
where [z(k+l1) - H x(k+1/k)] is the error in the predicted observations, 
and G(k) is a matrix of adjustment coefficients. The matrix G(k) reflects 
the relative confidence one should have in the observed data as compared 


to the predicted estimate. This is discussed in [9]. 


G(k) = P(ich1/k) H” [H P(kb1/k) HT + RY" (19) 
Where P(kt1/k) is the covariance of the estimates defined as follows: 
P(ketL/k) = Ef Ge(iebL/k) = xc(ich Gr (Ie 1/k) = ae(iek) (20) 
This formulation then requires that one must keep track of the co- 


variance of the estimates. This is also done in two steps. 


P(k+l/k) = @ P(k/k) & +r (21) 
P(ktl/k+1) = 
P(ktl/k) = P(ktl1/k) HY CH P(ti/k) H! + RY) OF PCL /k) (22) 
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Rauch [13] has derived these same equations based on the Gaussian 
assumption and shown the resulting estimate is the conditional mean and 
the maximum likelihood estimate. 

Lee [9] has shown that the same equations yield a weighted least 
squared residual estimate where the weightings are the inverses of the 
covariance matrices. This also follows from the derivation of the maxi- 
mum likelihood estimate based upon the assumption that the random se- 


quences are Gaussian. 


Trajectory Linearization 
The trajectory linearization method has been used to solve nonlinear 
filtering problems such as orbit determination for artificial satellites 
[10], [11]. It is assumed from physical considerations that the evolution 
of the system state satisfies a general difference equation of the form 
x°(ket1) = £Oc(k), w(k), k) (23) 
and that observations are available in the form 
2(k) = h@&(k), v(k), k) (24) 
where w(k) and v(k) are random vectors. 
It is further assumed that there is a known nominal trajectory which 
is a sequence of states x(k) such that 
x (k+l) = £(°(k), 0, k). <25) 
It is desired to find the best estimate of the deviation of the true 
State from the nominal state. 
Defining 
Y(k+1) w x(ktl) = 3° (ke+1) (26) 
and substituting (23) and (25) into (26) results in 
y(ker1) = £Gc(k), wlk), k) = £6e°%(K), 0, k) (27) 


This relation is now approximated by a first order Taylor series about 
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the point x(k) = x (k) and w(k) = 0. The two partial derivatives are 


given appropriate symbols. 
q (k) = £.G°(k), 0, k) (28) 
T (k) = £,Ge(k), 0, k) (29) 
The first order Taylor pene expansion results in 
y(lerl) = £00%(k), 0, k) + Ck) [x(k) = x°(k)] 
+ T(k) wk) = £Ge°(k), 0, k) (30) 
Substitution of (26) in (30) results in 
y (itl) = &(k) y(k) + T(k) wk). (31) 
From the nominal trajectory it is possible to construct a nominal 
set of observations 
2z°(k) = h@e°(k), 0, k). (32) 
Consider the deviations of the observations about the nominal obser- 
vations. 
u(k) #2z(k) - 2°(k) (33) 
Substituting (24) and (32) in (33) yields 
u(k) = h(x(k), v(k), k) = h@’(k), 0, k) (34) 
The relation is approximated by a first order Taylor series about 
the point x(k) =~ x°(k) and V(k) = 0. The two partial derivatives are 
given appropriate symbols 


H(k) & bh Geo(k), 0: k) (35) 
S(k) = a (@'(k), 0, k) (36) 


u(k) = h&e°(k), 0, k) + Yk) fxe(k) - x°(k)] 
+ S(k) v(k) - hOe(k), 0, k) (37) 
Substitution of (26) in (37) yields 


u(k) = A(k) y(k) + S(k) v(k) (38) 
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Although it was not noted in the description of the Kalman filter, 
it is true that the equations remain valid if any or all of the matrices 
&, H, T, R, are known functions of time. These filter equations are thus 
directly applicable to (31) and (38). 

The purpose of the process of trajectory linearization is to generate 
(31) and (38). It is then noted that with respect to the states y(k) and 
the observations u(k) the model is in the form of a linear dynamic system 
and linear observations. The Kalman filter is then applied directly as 


though (31) and (38) were equalities. 


Nonlinear Noise 

A question naturally arises concerning the adequacy of the first ord- 
er approximation in developing (31) and (38). The heart of this problem 
is investigated by Denham and Pines [5] through the use of a very simpli- 
fied model and a number of Monte Carlo studies. They reach the conclusion 
that the difficulties are of an indirect nature. The first estimates are 
about as good as might be expected. In processing subsequent data, how=- 
ever, trouble develops because the assumed quality of the first estimate 
is too great, which means that the next data get weighted too lightly. 

This effect can be best seen by reconsidering (31). In order to make 
this expression into an equality, all of the higher order terms must he 
added to the right side of the expression. These additional terms should 
be considered as part of the observation noise. It is the failure to 
account for this nonlinear noise in the Kalman filter that causes the dis- 
crepancy between the covariance of the estimates as computed in the filter 
and the true average squared estimation errors. When the calculated co- 
variance overstates the quality of a given estimate, a subsequent obser- 


vation will certainly be combined with the current estimate in a non- 
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optimum manner. 

Denham and Pines point out that when the order of magnitude of the 
expected value of the neglected terms is of the order of the natural 
measurement noise one cannot expect the linearized filter to work proper- 
ly. The expression for the "nonlinear noise" involves the difference 
between the true state and the point about which the linearization takes 
place. If there were some way to reduce this difference then the nonlinear 
noise would be reduced correspondingly. These authors attribute to John 
Breakwell an iterative procedure for accomplishing this. The procedure 
is to linearize and filter, then relinearize at the new estimate and fil- 
ter again. This cycle is repeated until the output of the filter is the 
same as the point at which the linearization takes place. A Monte Carlo 
study using this iterative technique revealed that the computed covariances 
quite accurately reflected the quality of the estimates. 

It will be noted that the method used to solve the least-squared-re- 
sidual problem as developed in the next section is exactly the iterative 


method suggested by Breakwell. 
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4. Development of the Solution Algorithm. 

The development will proceed in two phases, the first being a con- 
ceptual means of finding the absolute best estimate, the second being the 
development of a series of compromises required for computational feasi- 
bility. 

The minimization of (12) will have to be carried out in an iterative 
fashion since the simple process of differentiating and setting to zero 
does not lead to an explicit formula for the state estimates as it would 
if the observation functions were linear. The iterative procedure is 
based upon a linearization of the observation functions. The linearized 
observations are then in the form (16) and minimization is carried out 
using the Kalman filter equations. This produces a set of state estimates 
about which the nonlinear functions can be relinearized. This process is 
repeated until there is no further change in the state estimates. 

The Kalman filter in its normal form is not completely adequate since 
its output is the sequence of estimates x(1/1) through x(k/k) while the 
point about which it is desired to linearize is y(1/k) through x(k/k). 
These latter estimates are called the smoothed estimates. There are for- 
mulas available, due to Rauch [13], for converting the output of the Kalman 
filter into smoothed estimates. There is, however, a more convenient way 
to get the same results in this case where all of the smoothed estimates 
are required. This involves converting from a multistage problem to a 
single-stage problem with a proportionately enlarged state space. The de- 
tails of this conversion will be discussed following the discussion of the 


iteration for the single-stage process. 
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5. The Single Stage Minimization Procedure. 
The equations related to the single-stage process are rewritten here 
using a simplified notation and are numbered using the same numbers as in 


their first appearance with an ' added. 


Z2= h(x) +v (5') 

E[v] = 0 (6"') 

Efvv] =i (7') 

Efux'] = 0 (8') 

Ef (x-x,) Gen)" ] oe (10') 

Efx-x J = 0 (11') 

2 2 : 

C,) = lx, - x I oe: lz - h(x, || Rol (12") 

Oo 


where 
Zz 1s a vector of observations, 
v is the noise in these observations, 
R is covariance of the noise, 
xo is the a priori estimate of the true state x, 


%y is the new estimate of x, and 


re is the covariance of the a priori estimate. 


The pertinent equations from the solution to the linear problem are 


also rewritten here. 


Z=ilh xX + UD (16') 

x) = X + G@-Hx,) (18') 
T T -l ' 

G.=. Pela iepal + FR] (19') 


The non-singularity of Po assumed in (12') is fully discussed in 


Appendix I. It is sufficient here to say that if Po is singular the 
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problem can be reduced to a smaller state space where the associated P. 
is non-singular. It will always be assumed that A is non-singular, i.e., 
the assumption is made that there are no observations of unlimited ac- 
curacy. 

The iterative minimization procedure involves a linear (first-order) 
approximation to (5') about a point x, which will always be the best 
available estimate of x. This linearized approximation is then manipulat- 
ed to form a synthetic observation which has the form (16'). This syn- 
thetic observation is used in (18') and a new approximation to the best 
estimate is obtained. Using this estimate as the point of linearization 
of (5') the process is repeated. These steps are expressed symbolically 


as follows. 
ATW co SC ee (39) 
= l aad Ele l 


where x} is the ith approximation to the best estimate x, 


i 1 


zis 2 -h@}) +H x} (40) 
z-=H x +v (41) 
where z is the so-called Synthetic observation and 
i if 
itl _ 1 ee | 
where 
i - 
Gh = PH CHP HT + RI! (44) 


This completes the description of the pure minimization algorithm 
except for a specification of the initial point of linearization and the 
possibility of overshoot. Discussion on the initial point of linearization 


will follow after the discussion of conversion from multi-stage to single 
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stage. The possibility of an overshoot results from the fact that a 
simple first order approximation to the nonlinearity may not be accurate 
for the subsequent change in X)- The overshoot protection scheme used 
is simply if 

C(x.) 2C (x) 


i+ + 
then x, ' is replaced by 1/2 (22) : +x). 


It will now be shown that each step in this process does result in 
a decrease in the cost, C. Since if the new cost is greater than the old 
cost the step size, xy - xy » is reduced by a factor of 2 it only is 
necessary to show that for a small enough step size there will be a re- 
duction in C. The demonstration will be begun by showing that the change 
in estimate is related to the negative gradient of the cost evaluated at 


x} by a positive definite matrix. 


e601) = -C. G2) (45) 
l %y l 


ger) = Po’ Gejrxy) + HOT RT Tz-hGet)] (46) 


The change in the iterate is 


d= x, x} (47) 
d= xx + 6 T2*-H*x (48) 
d=x - x; y o*tz-noet + Hoey -H' x) (49) 
d= PIG HG -x}) + GTa-h@})] (50) 
Now to show that 
d= (POP HP? + RI? HYP.) g(x) (51) 


it must be shown that 
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1 


ips Ri eso 3 re - 
A) OS US ae ck) tel eee (52) 


and that 


Gta tp py (Hippy + Ry PR (53) 


i 
The first of these is obvious after substituting forG . 


In the latter PH’? is factored from the left side to obtain 
-l wi =] 
Pa (1 - (HP + RY PR. 
and then [HP HT Ra rH HT +R] is substituted for the I above. 


PHT Hp it + RY Hp +R - PTR 
After cancellation, the above is the expression for Gi. 

Note the matrix P =P [HP A Se HP. is the covariance of 
the updated estimate in the linear case so it will be given the special 
symbol 

P, =P, - Pf CH PH +R)” HP (54) 


Oo 


so that (51) can be written 
d= Piste) (55) 
After applying the overshoot control the actual change of the iterate 
is in the direction of d but may have a smaller magnitude. Let D be the 
actual change so that 
D = qd . (56) 


D= qP,8(x}) (57) 
where q is some integral power of (1/2). 
Then to a first-order approximation the change in C, AC, is given by 
ACB 2¢'D (58) 


ACz 2q gtd (59) 
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AC 2q g P18 (60) 


Ac 2q |I¢l P, (61) 


Since all higher order terms in the expansion of AC involve higher powers 
of q it can be asserted that for some small enough q the first order term 
will dominate. Thus for some suitable q the change in C will be negative 
for any non-zero g if P, is a positive definite matrix. Py is known from 
the linear theory to be positive definite for any value wee and any 
positive definite R. This also follows from the fact that 


P} _ Po + yitprlyi (62) 


a convenient matrix identity discussed in fl], and the fact that the ine 
verse of a positive definite matrix is positive definite. 

It has been shown that the special single-stage problem can be solved 
by solving a sequence of simple problems which approximate more and more 
closely the real problem. Each iteration was shown to reduce the cost 
unless the gradient of the cost was zero. 

In the next section it will be shown that the general problem cone 


sidered in this method can be recast in the form of a single-stage prob- 


lem. 
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6. Multi-stage Case Cast in Single stage Form. 

The process of converting a multi-stage estimation problem into a 
single stage problem is accomplished by defining the new state to be the 
juxta position of the states at the various stages. The process will be 
carried out in detail for a two-stage problem and then the process will 
be generalized by induction for a k-stage process. 

Let the new state be 

x(1) 


ees (63) 
*(2) 


x 
Mt 


let the new estimate be 


(1/2) 


ee ore (64) 
(2/2) 


let the a priori for the new state be 
x (1/0) 


= |en---- (65) 
x (2/0) 


Xo 
and let the a priori covariance of the new state be 
Pa, 1/0) | PCs Ay 


SS Pe oe ° (66) 
eae 2/0) | P(2, 2/0) 


O 


Some of the elements in x, and F have not been previously defined. 
However, the notation used has already been defined, i.e., x(2/0) means 
the estimate of the second state given no data. The submatrices in Po: 


are defined as follows: 


P(i, 4/0) = EL Gc(4/0) = x(4VOe(4/0) - #G))"] (67) 


This is a natural extension of the double argument notation alreay defined 
which is necessary to accomodate consideration of the cross correlation 
between states at various times. Since these new elements are not given 


directly in the multi-stage model it will be necessary to fill in these 
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elements in order to define the single-stage model. 
First consider x(2/0). Based upon the requirement that (11) be sat- 
isfied it must be true that 
x(2/0) = Efx(2)] . (68) 
Substituting for x(2) from (1) and taking advantage of (2), the fact that 
w(1) has mean zero, yields 
%(2/0) = GEf[x(Q1)] . (69) 
Introducing (11) above yields 
x(2/0) = (1/0) ° (70) 
If there are deterministic inputs (or equivalently Efw(k)] # 0) the 
appropriate modification is 
x(2/0) = 4x(1/0) + T Efw(1)) . (71) 
Now consider the various submatrices of P.: P(1,1/0) is already 
known in the multi-stage problem as P(1/0). P(1,2/0) is given by 
P(1,2/0) = Ef Ge(1/0) = x(1))Ge(2/0) = x(2))"] (72) 
Substituting (70) and (1) in the last part of (72) yields 
P(1,2/0) = Ef Ge(1/0) - x(1)) Gx (1/0) = @x(1) = Pw1))™} (73) 
Taking advantage of the independeace of w(1) from (2) and factoring 
yields 
P(1,2/0) = EL Ge(1/0) = 2(1))Ge(1/0) = (1) Je" (74) 
but this is just 
P(1,2/0) = Paso . (75) 
By Similar arguments it can be shown that 
P(2,1/0) = @P (1/0) (76) 
and 
P(2,2/0) = @Al/oje’ + rr" . (78) 


The remaining elements needed to complete the description of the single 
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stage model have to do with the observation process. 


Let the actual observations be 


z(1) 
2 apse (79) 


> 
2(2) 
let the vector of observation functions be 


h(v(1)) 
ce $ (80) 


h((2)) 


hG@e) 


let the measurement noise be 
v (1) 
D = |er-e : (81) 
v(2) 


and let the meaSurement noise covariance be 


R} 0 
= | eee (82) 
01 R 


where the off-diagonal elements of A are zero matrices in view of the 
whiteness of the observation noise. 

As the number of stages is increased, the dimension of the single- 
Stage x is also increased. The process of expanding the single-stage 
model of a k stage model to that of a (k + 1) -stage model is 
explained in detail only for xX, xX)» and P.. The expansion process or aug- 
mentation for the remaining elements of the model is as simple as the aug- 
mentation of x will prove to be. The augmentation of x is shown explicitly 
as an example. 


In the augmentation process y goes from 


x x (1) 
= to 
| 2 (k)} x(k) 
x(k + 1) 
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The expansion of x) is only slightly more involved. For the (k + l)- 


Stage case the a priori estimate is 


x (1/0) 


x (83) 
x (k/0) 


x(k + 1/0) 


lM 


where x(k + 1/0) = &&(k/0). 

The structure of the P. corresponding to the (k + l)-stage case is a 

(k + 1) by (k + 1) Square matrix whose elements are submatrices. The 
upper left k by k part of this matrix is already known from the P. assoce 
iated with the k-stage model. Thus to expand to the k + 1 case it is only 
necessary to fill in the lower border. The a priori covariance P has the 
form 


P(1,1/0) me oo PERK 0) 


P(1,k + 1/0) 


pe | (84) 


P(k,1/0) . « ~ P(k,k/0) P(k,k + 1/0) 


P(k + 1,1/0) . . .« P(k + 1,k/0) * P(kK +1, k + 1/0 : 
The formulas below for the new border elements a ae were derived in ex- 


actly the same way the submatrices P(1,2/0), P(2,1/0) and P(2,2/0) were 


found in the case for k+ l= 2. 


P(k + 1,i/0) = @P (k,i/O) ‘for lsi<sk (85) 
BURG AV Oene (te ONe formal <1 =< k (86) 
P(k + 1,k + 1/0) = @P (k,k/0) @ +e (87) 


Thus it is clear that the dynamic relations represented by (1) for the 
multi-stage problem are incorporated into the very special structure of 
the large covariance matrix P in the single-stage equivalent. 


After a given multi-stage problem has been cast in single-stage forn, 
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the single-stage problem is solved using the methods previously described. 
The outcome of the single-stage solution is the best estimate, ¥,, which 
must then be interpreted in terms of the original multi-stage problem. 

The best estimate, x); is the best estimate of all states given all data 
up to the present stage, i.e. 


x(1/k) 


v1 = e e (88) 


x (k/k 
Finally, as each new single-stage solution process is started there 
must be an inital estimate of x) which is called x} . The first iterate 
for a (k + 1)-stage minimization will be based on the final estimate from 


the previous minimization over k stages. Explicitly, the first iterate is 


given by 
x (1/k) 
x; = (89) 
(k/k) 
ne n 1/k) 
where x(k + 1/k) = @x (k/k) (90) 


At this stage in the development of the filter, the problem has been 
solved but in a very impractical way. The filter has been broken into an 
unlimited sequence of minimization problems. Each minimization problem 
is solved through an as yet unlimited Sequence of approximate solutions. 
In order to design a practical filter a realistic convergence test must 
be used to terminate the minimization process. Such a convergence test 


is discussed in the next section. Another difficulty arises in connection 
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with the sequence of minimization problems. As more and more data are 
considered, spanning a larger and larger collection of states, the size 
of the minimization problem increases. A practical means of limiting 

the size (dimensionality) of the minimization problem will be discussed 


in a subsequent section. 
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7. Criteria for Termination of the Minimization Procedure. 

It seems to be a universal feature of any iterative numerical method 
that the termination decision is based upon "feel" or "rules of thumb". 
Typically, a quantity is chosen as a measure of the convergence of the 
iterative process and this measure is compared against a standard or 
threshold. Fortunately, in this particular case it is possible to offer 
some insight into the choice of both the measure and the standard or 
threshold. This is true because the problem is basically a stochastic 
one. By analogy with certain special cases it is possible to approximate 
the probability density function of the measure of interest. 

The control law or algorithm for the control of the number of itera- 
tions is based upon the fact that the minimum cost, Cla)» is itself a 
random variable whose distribution function can be approximated by that 
of a chi square random variable. The minimum cost is exactly distributed 
as a chi square variable when the measurements are linearly related to 
the states and all of the random variables are Gaussian. The chi square 
distribution is characterized by a parameter called the number of degrees 
of freedom. For the least square problem the aumber of degrees of free- 
dom is the number of constraint Gace eins less the mumber of parameters 
adjusted in the process of minimizing the sum of the squares. 

Using this assumed distribution function for the minimum value of C 
it is possible to evaluate numerically the probability of a minimim C being 
greater than some number Cc: Actually the question is reversed so that a 


C, is found such that the probability that a minimum C is greater than C 


L L 


is some small number @. This number C, (@) is then used as a threshold for 
comparison with the actual value of C after each iteration. If C is 


greater than C, the process is reiterated on the assumption that it is 


L 
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very improbable that this value of C is in fact a minimum. 

Clearly using such a test will reject a certain number of true mini- 
mum calues of C. For this small percentage (@) of cases, the stopping 
criteria is based upon the relative change in state estimates. When the 
relative change in each component of the state vector after an iteration 
is less than some small number, EPS, the successive estimates are con- 
sidered to be equal and the process is terminated. 

On the other hand, passing this first test does not assure that a 
minimum has been reached. For this reason, a second test is prescribed 
which specifies a minimum improvement. Choosing the minimum improvement, 
Cu may be considered purely arbitrary. On the other hand it may be 
helpful to invoke a statistical interpretation to aid in choosing Cue 
Such an interpretation exists if all of the random sequences are Gaussian. 
Then c(x') is related to the likelihood ee x and comece is related 
to the likelihood ratio. The likelihood ratio is a common statistic used 
to test the significance of the difference between two estimates. 

The difference is considered to be significant at the B level if the 
probability of occurrence of the observed likelihood ratios is less than 


. i 
B under the assumption that x is the true state. The test of statisti- 


cal significance is made by setting a threshold on the likelihood ratio, 


itl] 
1 


likelihood ratio and has a chi square distribution with the number of de- 


or some function of it. The difference, C(x) )- (x ), is minus twice the 


grees of freedom equal to the number of components in the state, x. Cc. is 


chosen as that value for which the probability of a chi square variable 


a 
less than Co is B. The test is: if c(xt)-c(at a is greater than Co re- 
iterate, otherwise terminate the procedure. The satisfaction of the test 


suggests the interpretation that the last two estimates are not significantly 
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different so no further iteration is carried out. 


The convergence test described may be summarized in terms of three 
quantities involved in the minimization process and three thresholds. 
The three quantities are 1) r, the maximum absolute relative change in 
any component of the state vector from one iteration to the next,2) ch, 
i- i 


the current value of the cost and 3)(C C ), the change in the cost over 


the last iteration. The corresponding three threshold parameters are 


EPS, Cu Ci. The decision rules for terminating or continuing the pro- 


cess are displayed in Figure l. 










Reiterate 





Figure 1. Flow graph of the criteria for iteration termination. 
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As an example, consider a single stage minimization where the state 
has six components and the measurement has four components. The pro- 
bability that a chi square variable having four degrees of freedom will 
have a value greater than 14.9 is 0.005. This suggests that if C2 14.9 
only one true minimum out of 200 will be rejected by this test. The prob- 
ability that a chi square variable with six degrees of freedom will be 
less than 0.872 is 0.01. If C(a)-c(a SC, = .872 the last two iterations 
are not significantly different at a0.99 confidence level. Finally, for 
those unusual cases where the true minimum is greater than 14.9 the 
iterations are continued until the iteration values are the same within 
the limitations of computer word length. For the CDC 1604 the floating 
Operations carry about ten significant digits. It should be considered 
that absolute convergence has been attained when the relative change in 
all of the components of the estimate do not change more than one part in 
107, This indicates a choice for EPS of to" 

The remarks of this section were directed toward providing some in- 
sight into the choice of the threshold parameters. While the assumptions 
which would make these interpretations rigorous may in most cases be 
lacking, the filter designer must incorporate a convergence test, i.e., he 
must choose a set of parameters. The interpretations discussed above are 


offered as an aid toward choosing an efficient set of parameters. 
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8. Criteria for the Number of Smoothing Stages to be Carried. 

As a result of the way in which the best estimate has been defined, 
the complexity of the algorithm increases with the number of stages over 
which data is available. The estimate of state at the initial time is 
the result of a minimization involving n (number of components in the state 
vector) variables. The estimate of the state at the time otf the twelfth 
measurement would be the result of a minimization over 12n variables. 
Clearly, proceeding in this way the computational requirements will ex- 
ceed the capabilities of any computer after a finite number of stages. 

The work of Denham and Pines [5] has focused attention on the dif- 
ference between the case where the meaSurements are linear in the states 
and the more general nonlinear case. This difference was shown to be the 
result of neglecting higher-order terms in the expansion of the measure- 
ment function. The minimization over many stages may be viewed as a means 
of avoiding this problem since each linearization is only tentative. As 
new data become available, providing more information about the old states, 
the linearization of the measurement functions becomes more accurate. 

When the state is known well enough so that the second-order terms are 
negligible, the linearized measurement is considered to be accurate. This 
approach will be developed into a criterion for the number of smoothing 
stages which must be carried in the next minimization process. 

Consider a second-order expansion of a single nonlinear observation 
function about a point x. 

2 = ne?) + bh) Gene) + Ge )7 be?) Gene®) + v (91) 
where hr) is a row vector of partial derivatives of h evaluated at 
x° and he) is the symmetric matrix of second partials of h evaluated 


re) 
ac x. 
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In order to form a synthetic observation, z , of the form of (16), 
the synthetic observation must be a linear function of the true state 


with added noise which has zero mean. 
z° =z - h(x’) + h G2”) ee (92) 
zo > Haetvev' (93) 


where H = hb Ge) and 
b= & EL Gene) "h Ge°) Gene] (94) 

and v' is the variation of the second-order term about its mean,b. 

The added noise v' is considered to be the random noise caused by 
the linearizing process. The magnitude of this nonlinear noise can be 
measured in terms of its variance, A'. Expressions for evaluating b 
and #' are developed in Appendix II. 

The process of dropping a stage of smoothing is a lumping operation. 
It can be seen by examining the equations for the linear filter that all 
past data is brought forward in time through (17) and (21). This is not 
done immediately in the nonlinear filter because the observation equations 
have been linearized at a point which may be quite different from the true 
state. By keeping several stages active in the filter it is possible to 
perform the linearization at a point much closer to the true state. The 
dropping of a stage should be accompanied by a high degree of confidence 
that the last linearization was performed at a point close to the true 
state. That is, H is not going to change significantly as better esti- 
mates of the true state become available. The invariance of H is related 
to the expression for F' = ¥race[h_ Ge°)P] since he) represents the 
variability of H with x and P represents the variance of x. Thus when 


the natural noise in each element of the observation vector for a given 
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stage is an order of magnitude greater than the nonlinear noise, the 
stage corresponding to observation no longer carried. 

The mechanics of lumping are quite straight forward once it has 
been decided that the current linearization is a good approximation. 
Under these conditions, the Kalman sequential filter equations are dir- 
ectly applicable. If these equations are applied to those stages which 
are to be lumped, the result will be an a priori estimate of the first 
state which is not to be lumped. The states which have been lumped no 
longer appear. All of the information that these states carried with 
respect to the estimation of future states is characterized by the a 
priori estimate of the first state which is not lumped. From this point 
then, the problem has exactly the same structure as the problem before 


lumping except that there are fewer stages being carried. 


Ms Computation of the Solution. 

The basic features of the algorithm are shown in Figure 2. The 
following is a brief resume of the quantities which must be computed in 
order to implement the filter. Consider the overshoot decision. In or- 
der to decide whether an overshoot has occurred it will be necessary to 
evaluate the cost, C. For this purpose the multi-stage expression (12) 
is more convenient than the single stage expression (12). To test for 
convergence it is necessary to have the current cost, the previous cost, 
the current estimate, the previous estimate and the two parameters C 


L 


and Cu which are functions of the number of smoothing stages currently 
being carried. The decision on the number of stages to be carried de- 
pends upon an evaluation of the nonlinear noise associated with each 
observation. In order to evaluate this nonlinear noise, the matrix of 
second partial derivatives of the observation functions must be computed 
at the current best estimate of the true state. In addition, there must 


be available a covariance matrix representing the uncertainty of this 


estimate. 


Computation of the Cost 

In general there are many means of computing a given quantity. It 
turns out that the expressions for some quantities are useful in discus- 
sing the problem but are not the most efficient in actual computation. 
Such is the case with the cost C. For discussion purposes the cost was 
expressed in terms of the single-stage state variable. For computing 
the cost at an actual estimate, however, the form of (12) is more effic- 
ient. 


The first term is handled as follows: 


lec1/oy - xa/el|y, = IlBGeG/o) = xa/e II (95) 
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Figure 2. Flow graph of overall procedure. 
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where W, = P¥(1/0) (the pseudo inverse of P(1/0)) and B is chosen so that 
the equality holds which implies that B satisfies 


BB = p* (1/0) ° (96) 


A suitable B is found by a special routine which begins by decomposing 
P(i/0) into the form 
T 
P(1/0) = AA (97) 


where A is an nx r matrix and r is the rank of P(1/0). If r= n then 


P* yoy 2p a/oyand (98) 
ae (99) 

If r s n then 
p*(1/o) = al*at (100) 


which implies that B = at and a® can be computed by ae (ay ~ Sake where 


the indicated inverse is known to exist by construction. That is, ATA is 
r x r and has rank r so it i8 non-singular. The routine which computes A 
also computes fe if it exists, with only minor additional labor. 

For most applications P(1/0) will not be singular even though the 
single-stage covariance will be singular if [ has rank less than the 
system order. It is for this reason that the procedure adopted has a 
built-in flexibility to handle the singular case but handles the non- 
singular case with virtually no loss of computational efficiency over the 
more conventional approach of inverting directly the covariance matrix 
P(1/0). 

Evaluating the typical second term in (12) is accomplished in a 


Similar fashion 


. 2 2 
lae(i/k) - Ge(i-1/) Ih = ||S,x(i/k) - Sox (i-1/k)| 5 (101) 
where Wo = r)*. Since [T’ and @ are assumed to be constant throughout 
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the problem 3) and S. can be computed beforehand. By comparing Cerms 


HG = (Tr)? (102) 
and S,= Sé . (103) 


There is no loss of generality in assuming that [T has full rank, i.e., 


rank equal to its minimum dimension. Under these conditions 
Gaeta (104) 
where ri . ir yt . This implies that 


rh rt (105) 


j— 
it 


and Sy = TD) Te (106) 


where the indicated inverse is known to exist. 


Finally the third typical term of (12) is 


lz (k)-hGe(i/k))|| 7 (107) 
3 . 


where 


The procedure used to evaluate this term assumes that the measurement 
errors are independent, i.e., AR is diagonal. While this is a realistic 
assumption for most real problems there are means similar to those al- 
ready used to handle cases where the observation errors at any time are 
correlated with one another, i.e., RF is not diagonal. The details will 
not be discussed for lack of physical motivation. 

This computation is carried out in the computer subroutine COST. 
The auxiliary matrices B, Ss) and So are computed outside the subroutine. 
The matrix decomposition routine which generates A such that AA! m P(1/0) 


is called DECOMA. 
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The Basic Minimization Procedure 

Another procedure which is computed by a different method than that 
used in the development is the minimization process (43). The computa- 
tional procedure is a step-by-step solution of the linearized single- 
Stage problem. This single-stage problem is viewed as involving a se- 
quence of observations. Each observation is combined in turn to produce 
a new estimate of the state and a new covariance of that state. Figure 
3 shows the details of the step-by-step procedure. At any point in the 
procedure the best estimate, given the data processed, is x and the assoc- 
iated covariance is P. This type of sequential processing is valid under 
the assumption that the observation errors are mutually independent. In 
the computer program this process is carried out with the observations 
divided into blocks according to the time of the obServation. The sub- 


routine KALFIL processes each block in the manner indicated by Figure 3. 





ENTER Next observation = z 
P x : 
oO? Next linear 
— pe HH 
relation 
Next observation 
i —p R 
error variance 
X ——f X 
O T 
Pp ? PH —®G(vector) 


R+ HG —-* B(scalar) 


G/B —& D(vector) 


(® z-Hx —®E(scalar) 


x+ DE —®™x(vector) 
P-cb! —® p(matrix) 






OBSERVATIONS 
PROCESSED? 


l 
P—>P, e 


exit 


Figure 3. Flow graph of step-by-step minimization procedure. 
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This routine is entered once for each block or once for each smoothing 
Stage carried. 
This process yields the useful result that the vector xy after j 
blocks is composed of the sequences of state estimates x(1/j), x(2/j), 
--, “(k/j). Similarly the matrix P is composed of ne submatrices of the 


form 


P(1,1/j) P(1,2/j) . «© PC,k/j) 
m= 1PQ2,1/j)> 3. ms a 
P(k,1l/j) . . » «+ P(k,k/j) 


It is at this point that a lumping operation takes place. If it has been 
decided that all of the observations up to and including the (ia stage 
have negligible nonlinear noise, then a lumping operation is performed. 
This consists of shifting all of the estimates x up and out and shifting 
the P matrix up and to the left. This has the effect of eliminating any 
reference to any state at time j or earlier and reducing the dimension of 


the single-stage state x and the single-stage covariance P. 


x (1/2) 
x (2/2) AFTER x (3/2) SHIFTING x (1/0) 
(3/2) SHEFFERST yyy) TERE INDE To 0) 
x (4/2) 
PQ .1/2)8 PC,2/2) Pd,3/2) Page 
; P(2,1/2) P(2,2/2) P(2,3/2) P(2,4/2) 


-[P@G,1/2) P(3,2/2) PG3,3/2) PG,4/2) 
P(4,1/2) P(4,2/2) P(4,3/2) P(4,4/2) 


AFTER " P(3,3/2) P(3,4/2)) repr £ P(1,1/0) Pp {1,2/0) 
— PG,3/2) P(,46/2- THER P(2,1/0) pP(2,2/0) 


Figure 4 Evolution of the estimate and covariance during the transition 
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The process of shifting in the computer produces an automatic change in 
indices. The process is shown as two step for an example where j = 2, 
k = 4 in Figure 4. The subroutine SHIFT performs the details of shifting 
the matrices as indicated above as well as several other matrices which 


must be shifted. 


ak 


10. A Simple Example. 

Unfortunately examples seem to fall in two mutually exclusive cate- 
gories, enlightening or realistic. The following example is introduced 
to illustrate the mechanical details of the algorithm. This example 
also illustrates the process of abstracting the mathematical model from 
the physical situation. Consider an active, drunken, tight rope walker. 
He is put on a tight rope so that only one coordinate will be needed to 
specify his position which will be designated v(k). A drunken person is 
considered in order to introduce the concept that his position is a ran- 
dom quantity. 

It will be assumed from previous experience with drunken tight rope 
walkers that his next position is different from his last position by 
some completely random variable. The mean squared value of this dif- 
ference is assumed to be known. Further it is assumed that his ramblings 
to the right are balanced in the long run by those to the left, i.e., they 
have zero mean. The mathematical model for this much of the physical 


situation is given below. 


x(kt1) = x(k) + W(k) (108) 
E{w(k)] = 0 (109) 
Lucky oj] = 42 for oe (110) 


For concreteness Q is taken as 0.02. 

The first expression is usually said to be the model of a dynamic 
system excited by white noise. The second and third are quantitative de- 
scriptions of the noise. 

The next part of the situation that must be described is the process 
by which data are obtained. Here the concept of a nonlinear measurement 


is introduced. An angular measurement is made at some fixed distance 
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from the tight rope, i.e., the observer turns his head through a certain 
angle. For simplicity the observer is placed opposite the center of the 
tight rope at a distance of one unit. It will be assumed that the ob- 
server can sense the angular deflection of his head with a standard de- 


viation of 0.1 radians. The mathematical model for the observer is given 


below. 
2(k) = dan HG) + U(k) (111) 
E[v(k)] = 0 (112) 
E[v(k)v(j)] = . 5 ts (113) 


and # has been taken as 0.01. 

Finally thea priori data must be specified. For this example the 
use of an artificiala priori will be illustrated. The physical situation 
is such that before taking any data there is essentially no information 
available about position of the tight rope walker. This fact is model- 
led by taking thea priori estimate as zero and assigning a very large 
variance to this estimate, say 10,000. 


The structure is 
x%(1/0) = 0 (114) 


P(1/0) = 10,000 (115) 
This completes the mathematical model of the physical process underlying 
the observations. Assume that the first two observations are 0.7854 and 
0.900 radians. Using these observations the computations required by 
the filter are described in detail. 

The method described in Section 5 for the single-stage case is 

applied to this example for the first observation. The first estimate 
of the state is the a priori Berrigan acs = 0. The partial derivative 


1 
of the measurement is evaluated to find yt a L/C1+(x4)7] * 1 for the 
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first iteration. Appiving (44) yields 


Gt = (1)-(10,000)/((1)+ (10,000) ><) + (.013] = 1. 
Next zi is computed from (40), 
zi = .7854-tan'(0) + (1)°(0) = 0.7854. 
From (43) 
4 (not an exponent) = 0 + (1)°(7858 - (1,°¢G)) = 0.7854 

In Table I the resuits cof repeating this procedure three times are shown. 

Also shown in the table is the cost associated with each estimate, 
including the cost for the a priori. From a table for the chi square 
variable the threshold values are found to be G, (9.05) = 3.84 and 
C,00.95)= 0.004. Both of these are for a single degree of freedom. C. 
has one degree of freedom since there are two constraints, the a priori 
and the observation; less one adjustable quantity, the single component 
of the state estimate. The Cu also has a single degree of freedom since 
there is only one element in the state wector. From Table I it can be 
seen that the cost associated with xf might be considered a minimum cost, 
but there has been a significant decrease in the cost so the process is 
repeated. Considering the last iteration it maw re said that 1.0 4s not 
a significantly better estimate of the true state tham 9.9767, This 
terminates the first minimization process. 

It is interesting to note that the device of takimg a large a priori 
variance has led to the expected result thet »(1/1) converges rapidly 
to tan [z(1)] = 1.0. it may occur to the reader that this is the hard 
way to evaluate tan [z(1) J, but the advantage of general applicability 
of the method outweighs the advantages of comeidering special cases. In 


any case, the machinery for hendling « oriori imformetion must be avail- 


able in order to implement the lumping procedure. 
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TABLE I 


EXAMPLE OF SINGLE-STAGE MINIMIZATION PROCESS 





A PRIORI ESTIMATE = 0.0 OBSERVATION = .7854 
A PRIORI VARIANCE = 10,000 OBSERVATION 
ERROR = .01 
VARIANCE 








TION [ESTI- 





0.7854 0.7854 








0.6042 0.9767 1.40 









O25 121 1.0000 





55 


Next the bias and variance of the nonlinear noise are evaluated to 
determine whether a lumping operation is indicated. The nonlinear noise 
depends upon the variance of the new estimate and the second derivative 
of the observation function evaluated at the estimate. The variance of 
this estimate is computed from (54): 

Po = 10,000 - (10,000)*/(40,000 + 0.01) = 0.04. 

Using this variance the lumping test criterion can be computed from 
Appendix II. The bias due to the second order terms is 0.01 and the vari- 
ance of the second order term is 0.0001. Comparing this with the variance 
of natural noise (observation errors) it is noted that there is an order 
of magnitude difference and a lumping operation would normally take place. 

For this example the first stage will not be lumped so that the de- 
tails of the two-stage minimization process may be illustrated. If 
lumping had taken place the estimate would be projected forward to form 
the a priori estimate for the next time frame. Since @ = 1 for this 
simple dynamic system the new a priori is just the old best estimate. 

From (21) the variance of the a priori estimate is .06. To obtain the 
estimate of the position of the tight rope walker at the second time 
frame one proceeds exactly as above using the new a priori information. 

In order to solve the two-stage minimization problem the problem 
is reduced to a single-stage problem. The elements of the single-stage 
state are the position at the first time and the position at the second 
time. The relations described in Section 6 are used to generate a com- 


plete single-stage problem having a two-dimensional state. 


(116) 
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10000 10000 
Ps (117) 


. 10v00 10000.02 
0.7854 | 
.o (118) 
0.9000 
01 * © 
R= (119) 
0 01 
= — 
hy (x) tan (x) 
h(x) #=* = (120) 
= 
ho (x) tan (x5) 


The above is a new single-stage minimization problem and conceptually, it 
is solved in exactly the same way that theprevious (one dimensional) pro- 
blem was solved, i.e., by repeated application of (43). As a practical 
matter it is expedient to adjust the estimate separately for each ele- 
ment in 2. This is possible since the errors in the observations are in- 
dependent. This is true in general because the errors were assumed to be 
white in the multi-stage problem. 

Each iteration proceeds in two steps. First both elements of the 
estimate are adjusted for the first element of 2 and then the resulting 
adjusted estimates are adjusted for the second element of 2. See Figure 
3. The result of the first adjustment is already known and need not be 
computed. The first element of this intermediate result is the best 
estimate from the previous minimization process. The second element is 
just the predicted estimate +(2/1) = @x (1/1) = 1.0 from (83). The in- 
termediate covariance is computed from (85), (86), and (87) and known 


value of F(1/1). 


= (121) 
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Laat er ‘. (122) 
0.04 0.06 


Starting from this intermediate result the adjustment to the esti- 
mate for the second measurement is computed. ch is now a row vector of 
partials of the second observation function with respect to both elements 
of the state vector: yt = lo. 0.50 |. Note that the zero in yt is a 
general result of the way in which the problem is formulated. The measure- 
ment function is formally a function of the whole state x although it is 
clear from the construction of the single-stage form that each individual 
element of the measurement function, h(%), is a function of the state of 
the system at only one time. From (44) the adjustment coefficients, er, 
are obtained. 

G ® (123) 
Applying (40) the synthetic observation is found to be 
1 


gles | itan) (1) + CSC eos 


and the adjusted estimates are 


Qi eile Ugly 
PG . 
1.1375 


(124) 


The cost for the intermediate estimate (121) and the above (124) 
estimate, x1, are 1.313 and .552 respectively. Since these are a sum of 
four squared residuals with two adjustable quantities the convergence — 
test parameters are different. C, ¢-05) = 5.99 and C, 0-05) = .103. Based 
on these parameters it may be inferred that the above estimate is signifi- 
cantly better than the intermediate estimate. A second iteration will be 


carried out. 
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The minimization is accomplished in two steps. The first step is to 
adjust the a priori vector for the first observation. The difference be- 
tween this step and the single stage minimization is that the partial 


derivatives are evaluated gees given above in (124). The result of 


1 
this step is comparable to the previous intermediate estimate in (121) 


and (122) 
9 9951 
x (125) 
inter. 9951 
048 048 | 
eee . (126) 
.048 .068 


The second step is to adjust this intermediate estimate for the second 


observation. The partial derivative, 47, is to be evaluated at x, and not 


2 


25 . The result of this second adjustment is 
inter 


1.0978 
(127) 


1.1405 
The cost evaluated at this estimate is .539. The convergence tests in- 
dicates that the last estimate is not significantly better than the 
previous one. This minimization is said to have converged. 

After it has been decided that the process has converged it is neces-~ 
sary to evaluate the second-order terms in the expansion of the nonlinear 
measurement function. For this purpose it is necessary to have the co- 
variance of the last estimate. This covariance is automatically computed 
by the method displayed in Figure 3. 


002096 02096 
P= (128) 


1 
02096 02966 


a0 


From Appendix II the expected value of the second order term, denoted 
as the bias or simply b, is 
- 0048 


0065} ” 


and the variance about this expected value is 


- 000023 ii 


R's ° (130) 
-- 000042 


Assume that only the first of the measurements has negligible second= 


order terms. Then a lumping operation is indicated. This might be ac- 


2 
complished by noting the two-stage interpretation of yx eee and te 
2 x(1/1) 
~ inter. oS 
x%(2/1) 
and 


PQ1,1/1) = P(1, 2/1) 


inter. ™ (132) 


P(2,1/1) P(2,2/1) 
The lower element of os and the lower-right element of P consti- 
inter inter 
tute the a priori information for the state at the second time frame. 
It would be possible to consider this a priori information and the second 
observation as a new problem. 

In the computer program it is inconvenient to store all of the 
intermediate results awaiting a decision on which stages are to be lumped. 
An alternate method for carrying out the lumping operation will be de- 
scribed. Assume as above that it has been decided to lump the first 
stage. The next steps in the filter operation would normally be as fol- 


lows. The third state is predicted using (83) and the covariance matrix 


augmented accordingly using (84). These estimates are adjusted for the 
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third observation. The main minimization procedure is begun. Recall 
that the main minimization procedure is carried out in three steps, an 
adjustment for each observation. After the first of these steps the the 
intermediate result can be interpreted as 

x(1/1) 


ad x(2/1) (133) 
x(3/1) 


~snter. 


and 


Pg lee , 271). FC ,3/T) 


P 


= 
inter. 


P@yAd1) PC2,2/1) P(2,3/1) | . (134) 
2G, bee (',271) PC,3/1) 


At this point the lumping operation is carried out by reducing the dimen- 
sion of the single stage to two (see Figure 2) and storing the lower part 


of (133) and the lower-right part of P ater 3” in the area as- 


x 

inter 
signed to a priori information. The remaining two steps of the main mini- 
mization procedure are then carried out. If the process has not converged 
then the next minimization will only have two steps. 


This completes the description of the operation of the filter for 


this very simple example. 
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ll. Target Tracking. 

It was decided to exercise the scheme on as realistic a problem as 
could be found. The target data (which was fed into the filter) was gen- 
erated by a sophisticated simulator. The target motion is the result of 
maneuver commands generated by the user of the simulation scheme. The 
simulator then computes the motion of the target, and simulates the radar 
returns which that target would generate. The simulated radar is of the 
search type having as available outputs range, range rate and three dir- 
ection cosines at a rate of one frame every two seconds. The simulator 
decides, taking into account the relative position of the target and radar, 
whether a return is received. If a return is received, the simulator out- 
puts a noisy version of the true range, range rate and direction cosines. 
If no return is received the simulator sets a flag in the output data. 

At long ranges the chances of getting a return are relatively small but as 


the range decreases the radar gets returns more and more consistently. 


Forming the Mathematical Model 

It should be noted that this problem, as sketched above, does not fall 
directly into the model which has been assumed in the development of the 
technique. Among the parameters which have not been given in the descrip- 
tion of the problem are @, T, A and even x (the state space). This is 
typical of the way in which a problem is first encountered. What follows 
will be a series of engineering approximations which yield the mathematical 
model. This model forms the basis for the filter design. 

First consider the stochastic dynamic model. The dynamic model may 
be viewed as specifying two features of the problem. The first is a pre- 
diction function. One asks: how would one predict some future state of 


the system given perfect knowledge of the present state? This question in 


62 


fact helps to define the concept of state. The state of the system (for 
filtering purposes) is that collection of current attributes of the system 
which has a bearing on the future of the system. For the aircraft target 
the assumption of straight and level flight leads to an assignment of pos- 
ition and velocity as the states. The prediction function is based upon 
the assumption of constant velocity. Thus the components of the state 
vector are 

x) = north position (miles) 

Xo = north velocity (miles/sec) 

X, = east position (miles) 

Xx, 2 east velocity (miles/sec) 

Xo = down position (miles) 

X%~ = down velocity (miles/sec) 
and the prediction function is linear in the states and of the form 
x(k + 1) ped = @x (k). G€ is the discrete time form of three independent 


double integrators for a sample time of 2 seconds. 


1 2 * NG 0 ' 0 0 
i] i] 

0 1 ' O O's O 0 

=> = = = i — td zz BS aan eae wos ee eg @& 

0 Oo ' 1 Zo me &O 0 
rea) = | | 

0 o ' O 1 ' O 0 

= eS = Ld i = BS ae = mie eae we oe a @®@ 

0 Oo ' O Oi el: 2 
f | 

0 0 * QO 0 ' O 1 


The second feature which the model must provide is a measure of the 
prediction errors, or equivalently [. This implies that the prediction 
errors are random variables made up of several normalized random variables. 

The prediction error covariance is then @ = T r’. For this problem 
T was chosen under the assumption that in each direction there would be a 


Stepewise constant component of acceleration of random amplitude having 
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zero mean and Mahildnce Oo: This implied that [ takes the form 


‘ej 0 0 


On 0 0 


0 o 0 
T=2 : 

0 OF 0 

0 0 om 

0 0 op 


The parameters Ow OF and om must be chosen to account for turbulence and 


pilot maneuvers, which, from the filter's point of view, appear as random 


accelerations. 


On = ~02 
Or = 502 


Secondly consider the observation process. Having chosen the state 
Space x it is straightforward to write the functions h(x) 


h, &) = (se + x05 + x2)? = range (miles) 


z= range rate (miles/sec) 


x 
ee north direction cosine 
Ge? + x? + x2)? 
1 3 5 
ve 
WQeawrrYK(“rKrm TT = east direction cosine 
4 (y? ees 2% : 
te eee S 
5 
h.(¢) = = down direction cosine 
5 (se? K 2 z ¥2)2 
1 3 5 
The noise variance was approximated from considerations related to the 


radar simulator. 
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1.5 0 0 0 0 
6 


0 4.0 x 10. 0 0 0 

‘le 0 1% x*tor? 0 0 
0 0 0 3.1 x tory 0 
0 0 0 0 eeo | 


Finally, the a priori estimate and its associated covariance must be 
specified in order to complete the mathematical model. The fact that the 
first radar return has been received places the target in a certain volume 
in physical space. The position components of the a priori estimate were 
taken as the centoid of that volume and the limits of the volume were taken 
as three standard deviations on either side of the centroid. The a priori 
velocity estimates were based on the assumption that the target was headed 
directly toward the radar (in the negative north direction). The magnitude 
of the velocity was taken as that of a Mach 2 target. The covariance of 
each velocity estimate was assumed to be large compared to the square of 
this velocity. The a priori estimate was taken as: 

80.0 


-0.3 


The a priori covariance was taken as: 
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400 0 0 0 0 0 


0 09 0 0 0 0 

0 0 400 0 0 0 
P(1/0) = 

0 0 0 ~09 0 0 

0 0 0 0 400 0 

0 0 0 0 0 09 


This completes the process of abstracting the physical situation into 
the form of the mathematical model. 

It should be emphasized that the abstraction process must be carried 
out for each physical system that generates a sequence of measurements. If 
the model accurately describes the conditions under which the measurements 
are made then the filter can be expected to yield estimates which are best 
in some sense. Even an accurate model and an optimum filter do not assure 


that the estimates will be adequate for any particular purpose. 


The Algorithm Parameters 

There are three parameters that define the iteration termination cri- 
teria. They are the probability, a, that a minimum cost is greater than a 
given threshold, Ch 3 the level of statistical significance, 8; and the nunm- 
ber of significant digits used by the computer. 

The threshold, Ci: depends upon the number of stages carried (which 
determines the number of degrees of freedom) as well as uponad. There are 
five degrees of freedom in the cost for each stage carried since the system 
dynamics (1) introduce six constraints and the observations (5) introduce 
five constraints and there are but six adjustable parameters (the components 
of the state vector) for each stage carried. Ana of 0.05 was chosen in 
order to have a small but finite number of cases where the minimum cost 
was greater than Che 
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The likelihood-ratio-test threshold, C,,, also depends upon the num- 


M? 
ber of stages carried. The number of degrees of freedom is six for each 
stage carried since that is the number of components in the state vector. 
A significance level of 0.95 was arbitrarily chosen. The thresholds Cc. 


and Cu are shown in Table II. 

The filter was implemented on a CDC 1604 computer. This machine 
carries about ten significant figures. Two successive iterations were 
considered to be equivalent if all of the components of the estimate were 
equal in the first nine significant tigures. 

The lumping criterion is based on a comparison of the covariance of 
the observation errors and the covariance of the nonlinear noise intro- 
duced by the linearization process. For concreteness, the nonlinear 


noise was considered to be negligible when its covariance was less than 


that of the observation errors by a factor of ten. 


Target Tracking Results 

Three target trajectories were filtered. The results were quite 
similar. The target on which the largest number of observations were re- 
ceived will be described in some detail. = 

Figures 5 through 9 are a graphical display ot the tilter operation. 
Figures 5 and 6 show the true target trajectory projected on the NORTH- 
EAST plane and the NORTH-DOWN plane. Superimposed on the true trajectory 
are confidence areas generated by the filter. The boxes are used to pro- 
vide a measure ot the quality ot the estimate. The size and shape ot the 
box is computed from the covariance matrix of the estimate. If the es- 
timation errors were Gaussian with a covariance equal to that computed 
by the tilter the box would have the tollowing interpretation. There is 


an ellipse, centered about the estimate which contains the true state 
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TABLE I1 
ITERATION CONTROL PARAMETERS AS A FUNCTION OF THE NUMBER 


OF STAGES CARRIED FOR @ =.U5 AND B = .95 


NUMBER OF C C 


STAGES CARRIED 


1 1.64 
2 5.23 
3 9.39 
13.85 
5 18.49 
6 23.02 
7 27.86 
8 32.85 
9 37.80 
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Fig. 5. Projection on NORTH-EAST Fig. 6. Projection on NORTH=-DOWN 
plane of true trajectory plane of true trajectory 


with estimates. with estimates. 
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Fig. 7. OCoservation (Q) and estimation (+) errors in 


NORTH coordinate. 
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Fig. 8. Observation (Q) and estimation (+) errors in 


EAST coordinate. 





Fig. 9. Observation @) and estimation (+) errors in 


DOWN coordinate. 
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with probability of 0.63 and whose boundary is a curve of constant prob- 
ability density. The vertices of the box are the extremities of the 
major and minor axes of that ellipse. 

Figures 7, 8, and 9 show the estimation errors and the observation 
errors in each of the position coordinates. In addition the computed 
standard deviation of the estimates is shown as a solid curve. 

In order to illustrate the ability of the algorithm to converge to 
a least-squares estimate the cost is given in Table III as a function of 
the number of iterations and the number of observations received. The 
first cost listed in each row was evaluated at an estimate x' which has 


1 


not been adjusted for the newly received observation. The estimate x, 


for a (ktl)-stage minimization process is given by (89). The first iter- 


by adjusting x, 


Only the partial derivatives associated with this new observation are 


ation yields x for the most recently received observation. 


1 
evaluated for this step. This step is comparable to the simple single- 
stage linearization employed in [4] with such disappointing results. The 
second row indicates the danger of stopping at this point. Subsequent 
iterations reevaluate all of the partial derivatives at the previous 
estimate. All of the observations are reprocessed with estimates start- 
ing from the a priori estimate. 

The cost after a lumping operation is only the sum of squares of 
the residuals associated with stages still carried by the filter. The 
lumping operation occurs in the middle of second iteration because this 
is the first time that the intermediate results needed to form the new 
a priori estimate of the remaining stage become available again after the 
decision to lump has been made. 

The time required to produce a least~squares estimate is tabulated 


in Table IV. The time indicated does not include the time required for 
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TABLE III 
TYPICAL SEQUENCE OF COSTS AS A FUNCTION OF NUMBER 


ITERATIONS AND NUMBER OF OBSERVATIONS PROCESSED 


OBSERVATION C(xt) STAGES 
~ ca 
1 367. 0.57 | 0.13 1 
2 377.8 [218.5 | 42.07% 4 2.07 2 
3 35.4 1.7! 1 lens 3 
4 19:0 sjeal2goe | a2 é 
5 48.7 | 20.99 | 20.99 5 
6 121.6 | 27.36 | 27.36 6 
7 811.9 | 38.28 | 38.06 7 
8 749.9 | 101.3 | 44.16 | 44.16 8 
9 48.51} 44.4 | 9.49% | 9.49 2 
10 12.127 sjeewlO06" |guateons® P45 2 1 


* A lumping operation occurred. 
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TABLE IV 


TYPICAL COMPUTATIONAL TIME REQUIREMENTS FOR 


THE CDC 1604 COMPUTER 





CUMULATIVE jOBSERVATION 
TATION | COMPUTATION; ARRIVAL 


TIME 
SIMULATED 

0.817 0.0 
36.633 34 .00 
48.617 46.00 
52.300 48.00 
65 .633 60.00 
73.266 62 .00 
83./16 66.00 
109.800 78.00 
114.700 80.00 
116.800 82.00 
118.167 86.00 
134.050 110.00 
148.417 130.00 
| 196.017 196.00 
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auxiliary computations performed for diagnostic purposes nor the time 
required to read the data from the magnetic tape. It does include all 
computations inherent in the filter operation such as evaluating the 
cost and covariance of the nonlinear noise. The cumulative running time 
has been adjusted to reflect the fact that the filter cannot begin to 
compute a new estimate until the next observation is available. 

Comparison between Table III and Table IV yields the obvious fact 
that the time required to compute the estimate is highly dependent on 
the number of stages carried. From an analysis of the computations in- 
volved in Figure 3 it can be shown that the computations increase as the 
square Of the number of stages times the system order. The computation 
time depends linearly on the number of observations. 

The operation of the filter indicated for the tenth observation is 
typical of all of the remaining stages in both number of iterations and 
processing time required with the exception of the cases where the minimum 


cost was greater than C This happened 3 times out of 67 observations on 


L’ 
the longest run. In each case additional iterations involved only a single- 
stage. Two or three extra iterations were needed to satisfy the termina- 
tion criterion. The largest relative change in any component of the es- 
timate decreased approximately two orders of magnitude after each itera- 


tion. The average processing time for these three cases was about 1.9 


seconds. 
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12. Conclusions 

An algorithm has been developed for the processing of a sequence of 
noisy, nonlinear observations made on a dynamic system whose state is a 
random function of time. The best estimate of the state of the system 
at each observation time is defined to be the weighted-least-squares es- 
timate. 

This estimate is computed by solving a sequence of linear problems 
which approximate the nonlinear problem more and more closely. A method 
has been developed for automatically determining the number of iterations 
required to compute the least-squares estimate by the above procedure. 

The computation of each estimate is based on a least-squares fit on 
only a finite sequence of past observations. A method has been developed 
for determining the length of this sequence of past observations. The in- 
formation contained in the older observations is carried forward in the 
form of an a priori estimate. 

The radar tracking problem is an example of the type of problem which 
falls within the scope of this investigation. The algorithm was implemented 
on a digital computer and used to process a sequence of observations pro- 
vided by a realistic radar-target simulator. The estimation errors were 
generally within the expected range, considering the randomness of the 
dynamic system and the observation errors. The algorithm achieved the 
least-squares estimate in three or four iterations. The length of the 
sequence of observations on which the least-squares fit was based, rapidly 
settled to only a single previous observation. The computational require- 
ments appear excessive when compared with those associated with linear 
observations. There are, however, no other generally applicable methods 
when the observations are nonlinear and there is little prior information 


about the state of the system. 
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Appendix I 
Discussion of singularity of Po. 
The covariance matrix P is assumed to be non-singular throughout 
the development, however, there is a meaningful interpretation for the 
case where P. is singular. It will be shown that the filtering equations 
are still valid in view of this interpretation and that in the one 
instance where the inverse of Po is required (in evaluating C) the use of 
the pseudo inverse is appropriate. 
This development begins by defining a new set of state variables, y, 
so that the errors in the a priori estimates are uncorrelated. 
y= Ww c1) 
V7 We, (2) 
where U is a unitary matrix such that uv} = U and 
UP U’ = D (3) 
where D is a diagonal matrix. 
It will be shown now that D is the covariance of the a priori 
estimates in the y states, 
cov Y) = HUY.) Wy) 3 
cov (y) @ E[UG@-x.) G@-x,)'U') 
Cov (y.) = VEL Ge-x,) Gene.) "Ju" 
Cov Y,) = D (4) 
If f is singular then D has at least one zero on the diagonal or, 
to be more specific, the rank Or SG ae equal to the rank of D which is 
the number of non-zero elements along the diagonal of D. There is no 
loss in generality in assuming that all of the non-zero elements of D are 


in the upper part of the diagonal. The upper elements of Y are 
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conventional statistical estimates of the corresponding elements of the 
true state y, having a variance given by the element in D. On the other 
hand, the lower elements of y, are precise or exact estimates of the 
corresponding true state components and have no variance. When these 
interpretations are reflected back to the xy states the meaning of a 
singular P becomes clear. A singular P implies that a certain number 
of linear combinations of the states are known exactly. 

It will now be shown that the filtering equations, used repeatedly 
in the minimization process, produce estimates consistent with these 
interpretations. That is, the estimate of those linear combinations of 
the states which were known exactly before adjustment for observed data 
are not affected by the adjustment process. Further the covariance 
matrix, P) of the adjusted estimates reflects the fact that these linear 
combinations are known exactly. This demonstration will be carried out 
using the y state space. The filter equations are transformed to operate 
on the y coordinates. 

Consider first 

xs x te DRONE, ee anal (5) 
l Oo re) o Oo 
substituting y for x in (5) yields 
Uy, = Uy, + PATH +R)" [z - Huy), (6) 
pre-multiplying both sides of (6) yields 
ypu, +t WHE + RY (2 - Hy), (7) 
substituting u'pu for P . from (3) 


Y= V, + DU" (HU'DUY’ +R] fz - Huy), (8) 


tl 


and finally making a change of variables x Hut yields the filtering 


equation, 


¥) = Vey orem? +R)" [z 7 ky J 3 (9) 
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in the y states. Since the lower elements of BD are zero it is clear Chat 
there is no change in these components of the adjusted estimate, Y, » 
after application of (9). 
Now consider the covariance equation 
T T =1 
Pas Fa alee aes 
P, = U'pU = U'DUH[HU' Duy + RJ HU'DU 
T T T -l 
P, =U [(D- Oo [kor +R] xplu (10) 
multiplying in front by U and in back by uw 
T T el 
UP U =D- DK (Kur +) KD ne} 
Thus P, reflects the fact that those linear combinations of x which 
were known exactly are still known exactly. 


In the expression for the cost consider only the first term, 
20, - x,ll* » where W, was assumed to be the inverse of P. if it existed. 
1 
Substituting in the y states yields 


2 T 2 
lle, “ly, = |lU@, - Why, 


e) 


2 2 
, 20, > “ly, ss IV, 4 rll yw ut 
Defining @ = uwur and substituting above yields 
2 2 
[oc ss lh, = ly, 7 Vyllg 


Define y = VY, = y ) and partition y into two subvectors 
aly y 
Y= a = bs 
Uy 0 


where the lower subvector y, = 0 from (9) 
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Let Q be partitioned as 


where D is the upper, non-zero, part of D. Thus sum of the residuals 


can be expressed as 

2 2 

lo, elk = lv, | el 
1 D 
Uu 

The blank submatrices in Q@ above are immaterial since they will be 
multiplied by Y, = 0. Arbitrarily assigning zero submatrices to the 
blanks implies that those residuals which are known to be zero are given 


zero weight. This leads to 


pt 6 
Qaann = = uW,U 
0 0 


and premultiplying by ut and postmultiplying by U yields 


" pt o 
Wi = U u U 
0 0 


but this is just an expression for the pseudo inverse of Pf . 
Oo 
Thus if the pseudo inverse of Po is used in the definition of the 
cost there will be no weighting of the residuals in certain linear 


combinations of the states. On the other hand, if the x, is always 


l 
computed using (5) or one of its derivatives then these particular 
residuals will always be zero. 


During the discussion of the minimization algorithm the non- 


singularity p. was an impOrtant assumption. The discussion is valid for 


81 


the case of P, singular in the sense that a new minimization problem can 
be defined in terms of Uy)» taking ¥y Vy The discussion then implies 
that the change in the estimate of Y,, has a positive component in the 
direction of the negative gradient of C with respect Y When these 
conclusions are reflected back to the x state space it can be seen that 
the change in the estimate is related to the projection of the gradient 
of C into that subspace of the state space about which there is some 


uncertainty and for which an adjustment in the estimate is meaningful. 
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Appendix II 


Consider a random variable 
vax ar (1) 
where A is a symmetric matrix and x is a Gaussign random variable 
with mean zero and covariance P . It is desired to find an expression 
for the first and second moment of the random variable Uv. The 
development begins by making a change of variables 
x= BUY (2) 
where B is a decomposition of FP such that P = BB, U is a unitary matrix 
as yet unspecified, such that ut = I and y is a random vector of zero 
mean and identity covariance. The random variable VD is expressed in 
terms of y by substituting (2) in (1). 
V = yy! B ABUy (3) 
Now let U be chosen so that 
u'B ABU = D (4) 
where D is a diagonal matrix. Substituting (4) in (3) yields 
vey ly (5) 


or Ucan be expressed in terms of the components of y 
2 
v 6 
= Days (6) 


To evaluate the first moment of U the order of summation and 


expectation is interchanged. 


fy] = ECB divi) 


E[v ] = p4, ELys] 


83 


The components of y all have unit variance. 


Efu] «= d, (7) 
i 


Expressing (7) in matrix form yields 
E[v] = trD (7') 
The second moment is evaluated by expanding y? in terms of the 


components of yp. 
Z 2 2 
oe] = BE aye ay] 


Elv7] = EIT dey? | 1954; v4] 
i id j 


2 2 4 
Eto] = + djEly,] + © 4,4 jy] 
‘ee id; i? j (8) 


The first term is evaluated by recalling that the fourth central 
moment of a unit-variance Gaussian random variable is 3. Each 
element of the second term can be factored due to the independence 


of the components of y. 


Ely?) = 


a. 
Rm RO 


Pdi +E dd ely) Bly sl 
i id j 
Ev 7 - 3> a’ +7 did, 
i i? j 
2d ead a 
i i 


Ef v7] 


Substituting (7) above yields 


E(v"] = 2 dt + (Efv])7 (9) 
i 


This implies that the variance of v is 


var[v] = 2 di (10) 
i 


or in matrix form 


Var[v | = 2trfD*). (10') 
The results, (7) and (10), are expressed in terms of the parameters 


of the original problem. Substituting (4) in (7') yields 


Efv] = trf uta BU j. 


Taking advantage of the fact that trfAB] = tr{ BA] yields 
E[(v] = tr{A BU u's. ] 
and cancelling the unitary matrices yields 


Efv| = trf AP]. (11) 


The development for the variance procedes along analogous lines. 


iA BU up ta BU | 


a 


Varfv] = 2tr{ ut 
Varfv] = trf{A B BAB B] 
Var{v] = trf{ APAP] (12) 
It is possible to consider the covariance of two random variables 
as follows 
vex Ax (1) 
and a second random variable 
v' = x A'x (1') 
The means of both of these random variables are known from (11) and an 
expession will be developed for the expected value of the product. After 
making the same change of variables as before, the expected value of the 
product can be expressed as 
E[v'v] = Etytc y yD y) (13) 
where 
c= upta's u (14) 


This is equivalent to 


E[v'v] = tr(o Ely y'Dy y'j) (15) 
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Examining only the elements of the matrix inside the expected value 


operator, it is noted that the middle tern, y'D y, is a scalar. 


T 
yDy=Zdy, (16) 

i 

Thus, the elements of the matrix are 
BTY,Y, E44.) (17) 


This expected value is zero for i#j and for i*j it can be written as 


J 


2 2 4 2 
Ely, DCdy)=eldy,J+eEly =F d 
i, kk ivi kAisiK 


ELy S dv) Pi Sdivedt Dad 
ixk 


k 
Ely* od.) = 24, +54 (18) 
1 kk i k 
k k 
Thus, the expected value is a diagonal matrix of the following form 
elyy"byy"] = 2D + (x d)°1 
k 
Substituting in (15) yields 
E(v v) = er c(2D + (E d 14 
k 
which can further be simplified to 
E[v v] = 2er [cp] + (Dad, Jerc (19) 
k 
Considering only the factors of the last term, it is noted that 
ode E[v] (21) 
k 


and that 
tr c = tr(u's {a BU] 


trc =tr[AaP] 
tr C= E[v ] (22) 


Now the first term can be identified with the covariance from the general 


expression 
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Cov (yx, = eCx,»,J - Ex, J E(x, J (23) 


cov[v'v] = 2tr(c D] 


Cov[v'v] = 2tr(u'B a's U D) 


Cov(v'v] = 2tr{a'B U D uta] 


Cov [v'v] = 2tr(A'B U u'pta BU u'B'] 


Cov({ v'v] = 2tr[a'P A P] (24) 


The useful results of this analysis are (11) and (24) since (12) is 
only a special case of (24). Since these results will be used as a 
guide even when the random variable + is not known to be Gaussian it is 
worth reviewing just where the assumptions were used. The factor 2 
which appears in (24) comes directly from Gaussian assumption (i.e., the 
fourth central moment is 3 times the second central moment squared). A 
second result of the Gaussian assumption is that the new random variables 
y are statistically independent (used in (8) and (18). The variables 
Y are uncorrelated (since D is diagonal) but only in the Gaussian case 
does this imply independence. The effect of this assumption is diffi- 
cult to evaluate. On the other hand (8) does not depend upon the 


Gaussian assumption. 
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Appendix III 


Program Listings 
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JHYSH SLYVLS dOOT Y¥311135 NIVW 
SS9OVLiS TV1V YSAO LSOD ¢( SATLVIAWNI)IWLOL 3SHL 3SYOLS OL G3SN SI (2*T)xX 
*0='\¢c*L)xX 
OO& INIUd 
XVWSEWNSSN¢6%7 SJdVLl JSLIYM 
O=VOr 
XVWS9TT INIUd 
G#XVW=XVW 
XVW *€ SdVl av3sY 
€ GNIMSY 
°"O=(TST)X 
LSIHS WIVO $ O=FNI1$°O0=TLSOD 
(T*L)OotX=(T*1)2x 4 
(FSTIOTd=(F SIT) Ed GS 
SN*t=r ¢ 00 $ SN*T=I +» O0S T=) 
°"O=AVOSS 
O=T1I4N 
LINI WV) 
SSAISVIYVA SZIIVILINI 
OO€ ILNI&d 
J9Vd V dIns 
OQV3y¥ SNILNOYGNS HONOYHL SSIDIYLVW NOISSO LNdNI 
Gv3uy W1V) 
67 OGNIMSY 
(THT) LVWYOS OOE 
O0e ILNIYd 
(OT) ASNS (OT) YNS(0T*S 848 I0DG4(0T*84*8) EVES 
(OT*B)YNYS 4 
CLSODS TLSODS XVWASNTSELOOHS 149 € 
SANS XVWSWNSMNSSNS (OB SOB IGdS(OTSB)CXS(OTSB) Txt (OTSB8)VZ*(O1S8*8) THEc 
(OTSBITZS(8SB8)STISGS (OT SSS B)HE(OTS8) ZS(8)YS# (BS B)OSS(OTSEBS) XS (B)OTAST 
(S*B)OTSS(BIYS(O0BSOBIOTdS (OTSB)OTX* (848)0*(8*8)1390*(848)IHd NOWWOD 


NOY WVe90dd 


2 


» 
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( SNt+EI1*°VI)Gd 
(SN+9EI°VI)OTd= 


°OT- ((4*2)2)4S80 = YO 
VLVO GIIVA 804 SBWV44 LSSL S 


(TAS TIOTX=(44S1)0TX 


(SN+GI*SSNtD1)8d =(GISDI)Gd 25 
(SN+tEGI*SNtDI)0Td= (GI*DI)0Td 
=(VISGI) Gd=(GI*VI) Gd €S 


(VI*SGIJOTd=(GI*VI)OTd 
I+SNxe( 2-7) =VI 
A*zc=1 €S OA 
C+SNe(T-X) =GI 
SN*T=F 2S OA 
I+SNx(T-) =I 
SN*T=I TS OG 
( ¢CX* @d) LNSWONV WV) 
(OTX*OTd) LNSIWONVY WIVWD 
T+( A) ASN=(X) HSN 
JWVYS STIHL VLVG ON 3) 
SNNILNOD 64 
67°O€ *6Hn(V9OF) SI 
JNNILNOD [2 
T2*O02*O02(V9) JAI 


90 


(€Il €d3SS3450u8d SL3IS viVd 40 YSEWNN HOE )LYWHOS OTT 


COL°9LO9/ (N£ 2) Z= he 2) 7 


DAS Yad SAIIW OL JLVY SONVY LYZANOD D, 


(G1*Z=1*v)S(nS72)2 
(GIS Z=1*V)S (4ST) 2 
(GIS z=I1 SV) *(4SS)2 
(GIS Z=18V) 8 (AS H)Z 
(GI*Z=I1SV) 8 (ASE) Z 


Z9*° (XVWLI* Z=1*¥) ¢ oe 


6 
6 
6 
6 


ea rd edt 


(GIS Tare(irsT=Isv)) ©€ 3dVL GV3Hy 


GI*e sdvi avay 


(XVWLISC=1SV) § XD *tmedvale Ovex 


LF SXVWLISe ddVLl OV34 
(T)Y=(NST) UY CE 
SN*T=I Ce OA 
XVWSTHWX OC OC 


JANILNOD 22 


G3SWNSSV T3AdCOW 3SHL HLIM V1IVG G3SA3ZID3Y SHL JO » 
ADNVISISNOD WIVWYSAO SHL SLVATVAS OL G3SN SI SIH1L *NOITLNEGILSIA B, 
JVEVIYVA SYxYVNOS IHD SHL OL WOILVWIXOUddVY TIVWYON SHL 3SLAdWOD 2, 
((OVe°6)/ °C) SLYOS/ ( (0 °629OV)/°Z + TEV / LLY Xie aa ery 
(ZJST)X-9°G x9OV=9V 
A-VIAN =9V 
(AVOST 
S~Zwex(ZO—( NSS) ZX) +Zuw (AD—( SE) ZK) +72 4H ( XO— (9° T) ZX) ) T 


*#¥NIVOtAVOSS=AVOSS LZ 
V/*®T=HNIVS S$ TISN=V 92 
bel 2* 9c (0 ea Naa 
T+ 11 43N= 1I4N 
SSISVIYVA NOILISOd JO S3YVNOS JO WNS G3SHLOOWS S3LVYSNS9 2 
SSWVYS N3Ll JO LNVLISNOD SWIL HLIM YSLTI4S SSVd MOT V AG SI ONIHLOOWS 2 
ALTIVVNO ae3aslitIa JO 3SYNSAW BATILININI NV SV G3SN SI SIHL » 


T= (T+) HSN 
JWIL W1V9 
CNNY V1V) 
TNAY VIVD 
ViVd demi 
JWIl W1V5 
T=VOr 
JNNILNOD TOS 
GQ3SS350YU¥d SIWVYS JO YAEWNN ®) 
JH1 3S7O0SNOD SJSHL WOYUS TIOYLNOD Ol YOLVYSdO SMOTIV 2 
00S*TOS*00S ((2)4MSI)41 
JANILNOD #2 


72° UC * Leal Sa eae 
JONVGY JZAILVOSN HLIM S3SWVYS JYONDI 2 
SJANILNOD O02 


OF OF 02 
(T+H°1T)2X=(0S1) 2X TS 
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OO& INIUd €£0E 
COE SCOES ZOE (T+I-DIe(DT/(T=H1I)))41 
TIAN‘ T=I Toe od 
647 sdV1l QV3Xy 
67 3dV1l QvV3u 


VLVG ONLNOD 39vd > 
T+31=97 
(T+O€/TI4N)/1I4N =97 
64 GNIM3Y 
ANNILNOD Sly 
AYWWWNS YZLNIYd > 
ANNILNOD 00S 
64 3J114 GNI 
AINNILNOD O€ 
14IHS W1VD 
(T)LSOD + (T*&T)X=(1T*T)X 
(T*THSTZ*IX)X LV4 WIVD 
Ts 
JOVYOLS 31@1IVAV 4O XDV7 NO G3SVE * NOILVY3d0 ONIdwnt > 


SONNELNOD €2 
Coco. OC (Olay eal Tec 


VOF+ I= 
WASOTT LNIYd 
29° AD*XD *64 Sdvl 3LlixM 
(NVSTV=1S(N1617=1601°1)Gd))*6% JdvVLl 3LIYM 
T+SN-N1=11 
SN*¥A=N1 
(9°T=I1S(4S1)2X) $67 JdVL JLIUM 
(S*T=19CNST)2Z) $64 3SdVl SLIUM 


NO fe MOSS ASWAS TIAN $67 JdV1l S3LIUYM 
AYVWWNS SS3LTI4 YyOd NOITLVWHOSNI SYOLS 3, 
AINO VIVO HLIM S3WVYS NO 2 


AVIdSTG WOTHdWYS YOS SSLVWILS3 G3SYNRFLII4 NO NOILVWHOSNI LAdLNO 2) 
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NV°6%7 3dVl QV4aa 

JSOZ* 42S 50QAS SAS 50X*4XS6H JAdVL AVS 
VUG*O030*ONG*O0NNS ONS 64 JdVl AQV3Y 
LASVNS6% JAdVl QV3u 

6¥esdVl QVv3auy 


6% sdVl QV5yY SOE 
YSdO W4ed Il ied HS€e 
LUILAHS7X9 C 
J//* AWITLHOX92 
* NMOOQHS*SX?2TS LSVSHSXZT*SHLYONHSXZ27/ 3 S3ANISOD NOILDSYIGHLI* xX9LTI 
*3JLVY JONVYHOTXITSSONVYHS SXT TS /SAXYVWWNS JTIGVAYNSSGOHST*XLE)LVYWYOS ZZ 
ect iNIdd 
OO& LNIYd 90¢ 
SO€*90ESGOE(LS3SLIDGAI 
I+] -DIx(DI/(T-1))= LS351L1 
TOYLNOD 39Vd JYOW 
TIANST=I vOe OF 
6% jdVl dv3su 
6% ddVl QV3Y S$ 64 GNIMSY 
(/*©9TSe*9deS2°SGS9STP9OSE*S XS) LVWYOS O2T 
PAS aAgZ © aC A sOxX © 00d" 29* 32° 03S0* A9* SA SONQ* X9O*%4X* ORT BNIAd Loe 
GQS3AY¥SSHO ONY SSNNL © GSNSLTIS xe¥SSLYNIGQHYOOD NVISSLYVD LNIYd 
04¥x00G=00G 
C#LA=LA S$ OYxXOIG=0350 
OYxXONQ=ONQ 
Z9*AD*X9O° 69 AdVl~gvsa4a 
NVS67 JdVl QV5Ua 


(//7/* aSdO avsy Lid 


* 8Sd0 Wad lie asd0 Lala YSgO 1V3ua 


4OZ° 4572S S0AS SAS S0XSSXS64H JdVL OVS 
000*03SGSONGS0NNS ONS 64% AdV1l dV3Y 
LASVYNS 6% SJdV1l QVvsuy 

6% 3sdVl dVay 


60. Jat OV 8d <0t 
(//*§ NMOQ %ILSV3 HILYON L115 W3d LITSHZe*2 
YSdO Ws LIS Y¥SdO WSN LITSHCEXL//*S3SWILHYX6*SS3ZILIDOTSAHOI X61 
*NMOQHYXT TS LSVSHYXCTSHLYONHSXTTS//SAYVWAWNS JSLVISHETXZE)LVWYHOS IZI 
Tél INIdd 
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C/SSOTSE*GIESXIEEPGHEXT HE PGAESKXZFEEMGIZSXZET*9SE*XG) LVWYHOS E2T 
14° 00d § 940 § 30G*030*530£ 43G*ONG*ONGS SINGS ONN*S SUNS ONsONS SNSEZT ININd OE 
JLVY JONVY 


*G3AYN5SYO OGNV 


® 


VLIVGQ ADNSLSISNOD GNV ONIWIL 


(e*°OLs2)LVWHOS OST 


NV°S6%7 3dV1l Qv3uy 
NV*S6% J3dVl QvV3uy 
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